


Let $f : \{1, 2, \ldots, N\} \to [0, \infty)$ be a non-negative signal, defined over a very large domain,
and suppose that we want to be able to address approximate aggregate queries or point
queries about $f$. To answer queries about $f$, we introduce a new type of random sketches
called max-stable sketches. The (ideal precision) max-stable sketch of $f$, $E_j(f)$, $1 \le j \le K$,
is defined as:

$$E_j(f) := \max_{1 \le i \le N} f(i) Z_j(i), \quad 1 \le j \le K,$$

where the $KN$ random variables $Z_j(i)$ are independent with standard $\alpha$-Fréchet
distribution, that is, $P\{Z_j(i) \le x\} = \exp\{-x^{-\alpha}\}$, $x > 0$, where $\alpha$ is an arbitrary positive
parameter. Max-stable sketches are particularly natural when dealing with maximally
updated data streams, logs of record events and dominance norms or relations between
large signals. By using only max-stable sketches of relatively small size $K \ll N$, we can
compute in small space and time: (i) the $\ell_\alpha$-norm, $\alpha > 0$, of the signal, (ii) the distance
between two signals in a metric related to the $\ell_\alpha$-norm, and (iii) dominance norms, that
is, the norm of the maxima of several signals. In addition, we can also derive point queries
about the signal.

As is the case of $p$-stable, $0 < p \le 2$, (sum-stable) additive sketches, see Indyk (2000),
max-stability ensures that $E_j(f) \stackrel{d}{=} \|f\|_{\ell_\alpha} Z_1(1)$, with
$\|f\|_{\ell_\alpha} = \big(\sum_{1 \le i \le N} f(i)^\alpha\big)^{1/\alpha}$, where
$\stackrel{d}{=}$ means equal in distribution. We derive $\epsilon$-$\delta$ probability bounds on the relative error for
distances and dominance norms. This can be implemented by efficient algorithms requiring
small space even when the computational precision is finite and a limited amount of random
bits are used. Our approach in approximating dominance norms improves considerably on
existing techniques in the literature.



1 Introduction and motivation 

Random sketches have become an important tool in building unusually efficient algorithms 
for approximate representation of large data sets. One of their major applications is to data 
streams. To put our work into perspective, we start by describing briefly the data streaming 
context. We then list some of the major contributions and discuss our results. 

Consider an integer-valued signal $f : \{1, \ldots, N\} \to \{-M, \ldots, M\}$, defined over a "very
large" domain, so that it is not feasible to store and/or process it in real time. The signal is
updated or acquired sequentially in time, starting with the zero signal at time zero. Following,
for example, Gilbert et al. (2001b) (see also Muthukrishnan (2003) and the seminal work of
Henzinger, Raghavan and Rajagopalan (1998)), we focus on two streaming models: (i) cash
register, and (ii) aggregate. In case (i), data pairs $(i, a(i))$ are observed successively (in arbitrary
order in $i$) and on each data arrival, the $i$-th component of the signal is updated incrementally
(like a cash register): $f(i) := f(i) + a(i)$. In case (ii), the data pairs $(i, f(i))$ are observed
directly (again in arbitrary order in $i$). In this case, a given index $i$ appears at most once and the
corresponding $f(i)$'s are not updated incrementally multiple times. Model (i) is more general
and more widely used. Both models, however, have found important applications in many 
areas such as on-line processing of large data bases, network traffic monitoring, computational 
geometry, etc. (see, e.g. Gilbert, Kotidis, Muthukrishnan and Strauss (2002, 2001a), Cormode 
and Muthukrishnan (2003b), Indyk (2003)). Much of the work on sketches was motivated by
the seminal paper of Alon, Matias and Szegedy (1996). For a detailed review of methodologies 
and applications in this emerging area in theoretical computer science see Muthukrishnan 
(2003). 

Random sketches are statistical summaries of the signal $f$, which can be updated sequentially
(as the stream is observed) using little processing time, processing space and computations.
Many algorithms involving random sketches have been proposed, see e.g. Muthukrishnan
(2003) and the references therein. As a common feature, they provide approximations to various
queries (functionals) on the signal $f$, within a factor of $(1 \pm \epsilon)$, with probability at least
$(1 - \delta)$, where $\epsilon > 0$ and $\delta > 0$ are "small" error and probability parameters chosen in advance.
Typically, this is realized by algorithms consuming an amount

$$O\big(\log_2 M \, (\log_2 N)^{O(1)} \ln(1/\delta)/\epsilon^{O(1)}\big)$$

of storage, and even smaller order of processing time per stream item. Here $M$ denotes the
size of the range of the values $a(i)$ in the cash register model or $f(i)$ in the aggregate model. In
many applications, one may be willing to sacrifice deterministic approximations in favor
of stochastic approximations, which are valid with high probability and are easy to compute.
Indyk (2000) has pioneered the use of $p$-stable distributions in random sketches (see also,
Feigenbaum, Kannan, Strauss and Viswanathan (1999), Fong and Strauss (2000), Cormode
(2003) and Cormode and Muthukrishnan (2003a)). The $p$-stable ($0 < p \le 2$) sketch of the
signal $f$ is defined as:

$$S_j(f) := \sum_{i=1}^{N} f(i) X_j(i), \quad j = 1, \ldots, K, \tag{1.1}$$

where the $KN$ random variables $X_j(i)$ are independent with a $p$-stable distribution. The



stability (sum-stability, see the Appendix) property of the $p$-stable distribution implies that

$$S_j(f) \stackrel{d}{=} \Big(\sum_{i=1}^{N} |f(i)|^p\Big)^{1/p} X_1(1) = \|f\|_{\ell_p} X_1(1), \quad j = 1, \ldots, K, \tag{1.2}$$

where $\stackrel{d}{=}$ means equal in distribution. Also, by the linearity of the inner product (1.1), the
sketch $S_j(f)$, $j = 1, \ldots, K$ of $f$ can be updated sequentially, in both the cash register and aggregate
streaming models, with $O(K)$ operations per pair $(i, a(i))$ or $(i, f(i))$, where the $X_j(i)$'s are
generated efficiently, on demand (see below). In his seminal work, Indyk (2000) used $p = 1$
(Cauchy distributions for the $X_j(i)$'s) and the median statistic

$$\mathrm{median}\{|S_j(f)|, \ 1 \le j \le K\}$$

to estimate the norm $\|f\|_{\ell_1} := \sum_{i=1}^{N} |f(i)|$ of the signal. It was shown that for any $\epsilon > 0$ and
$\delta > 0$, the norm $\|f\|_{\ell_1}$ is estimated within a factor of $(1 \pm \epsilon)$ with probability at least $(1 - \delta)$,
provided that

$$K \ge O\Big(\frac{1}{\epsilon^2} \log(1/\delta)\Big). \tag{1.3}$$

Moreover, by using the results of Nisan (1990), these estimates were shown to be realized with
$O(\log_2 M \log_2(N/\delta) \log(1/\delta)/\epsilon^2)$ bits of storage, needed primarily to store truly random bits or
seeds for the pseudorandom generator. Roughly speaking, these seed bits are used to efficiently 
generate any one of the KN variables Zj(i)'s on demand, when a data pair (i, a(i)) or (i, f(i)) 
is observed. 

Exploiting the linearity of sketches and the properties of the stable distributions, Indyk 
(2000) also developed approximate embeddings of high-dimensional signals $f \in \ell_1^N$ in $\ell_1^m$,
where $m \ll N$. This allowed efficient approximate solutions to difficult nearest neighbor
search problems in high dimensions, see e.g. Datar et al. (2004). Other authors also used
stable distributions to construct efficient stochastic approximation algorithms, see e.g. Cormode 
and Muthukrishnan (2003a). 

Here, we propose a novel type of random sketches, called max-stable sketches. Namely,
consider a non-negative signal $f : \{1, \ldots, N\} \to [0, \infty)$. The $\alpha$-max-stable sketch of $f$ is
defined as:

$$E_j(f) := \max_{1 \le i \le N} f(i) Z_j(i), \quad j = 1, \ldots, K, \tag{1.4}$$

where the $KN$ random variables $Z_j(i)$ are independent standard $\alpha$-Fréchet, that is,

$$P\{Z_j(i) \le x\} = \Phi_\alpha(x) := \begin{cases} \exp\{-x^{-\alpha}\}, & x > 0, \\ 0, & x \le 0, \end{cases} \tag{1.5}$$

and where $\alpha > 0$ is an arbitrary positive parameter.

The max-stable sketches can only be maintained in the aggregate streaming model, that 
is, when any given index value i is observed at most once in the stream. This is so because the 
operation "max" is not linear. Indeed, if the signal values f(i) were incremented sequentially 
(as in the cash register model), then to be able to update the max-stable sketch of $f$, one would
have to know the whole signal thus far, which is not feasible. Other than the aggregate model, 



a natural streaming context for max-stable sketches is when the cash register is updated in a 
max-incremental fashion: 

$$f(i) := \max\{f(i), a(i)\}.$$

In this setting, max-stable sketches can be maintained sequentially. 
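
The max-incremental update can be illustrated with a minimal sketch in Python. This is only an illustration under stated assumptions, not the finite-precision algorithm discussed later: the class name `MaxStableSketch`, its methods, and the way each $Z_j(i)$ is regenerated from an integer seed are hypothetical choices, and the standard-library generator stands in for the pseudorandom generator of Nisan (1990).

```python
import math
import random


class MaxStableSketch:
    """Ideal-precision alpha-max-stable sketch E_1(f), ..., E_K(f) (illustrative)."""

    def __init__(self, K, alpha, seed=0):
        self.K, self.alpha, self.seed = K, alpha, seed
        self.E = [0.0] * K  # sketch of the zero signal

    def z(self, j, i):
        """Regenerate the standard alpha-Frechet variable Z_j(i) on demand.

        Assumption: indices i, j are below 10**6, so the derived seeds are distinct.
        """
        rng = random.Random(self.seed * 10**12 + j * 10**6 + i)
        u = rng.random()
        while u <= 0.0:  # avoid log(1/0)
            u = rng.random()
        return (-math.log(u)) ** (-1.0 / self.alpha)  # inverse CDF of (1.5)

    def update(self, i, a):
        """Process the pair (i, a) as f(i) := max{f(i), a}, with a >= 0."""
        for j in range(self.K):
            self.E[j] = max(self.E[j], a * self.z(j, i))
```

Because each $Z_j(i)$ is positive and is reproduced identically every time index $i$ arrives, feeding the pairs $(i, a)$ in any order yields the same sketch as feeding the final signal once, which is the property used in the aggregate and max-incremental models.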

In the spirit of $p$-stable sketches, the max-stability of the $Z_j(i)$'s implies (see the
Appendix):

$$E_j(f) \stackrel{d}{=} \Big(\sum_{i=1}^{N} f(i)^\alpha\Big)^{1/\alpha} Z_1(1) = \|f\|_{\ell_\alpha} Z_1(1), \quad j = 1, \ldots, K.$$

Therefore, as in Indyk (2000), for any $\epsilon > 0$ and $\delta > 0$, we can estimate the norm $\|f\|_{\ell_\alpha}$ within
a factor of $(1 \pm \epsilon)$, with probability at least $(1 - \delta)$, if

$$K \ge O\Big(\frac{1}{\epsilon^2} \ln(1/\delta)\Big).$$

Following the ideas in Indyk (2000), we show by using results of Nisan (1990), that this can
be realized with real algorithms of space

$$O(\log_2(M) \log_2(N/\delta) \ln(1/\delta)/\epsilon^2), \quad \text{and} \quad O(\log_2(M) \ln(1/\delta)/\epsilon^2)$$

processing time per stream item $(i, f(i))$. Note that $\alpha > 0$ can be chosen arbitrarily large,
whereas one is limited to $0 < p \le 2$ in the $p$-stable case. Since the max-stable sketches are
non-linear, being able to approximate $\|f\|_{\ell_\alpha}$, for any $\alpha > 0$, does not imply approximation of
the distance $\|f - g\|_{\ell_\alpha}$ in the $\ell_\alpha$-norm of two signals based on their sketches. Therefore, our
results do not contradict the findings of Saks and Sun (2002). The recent paper of Indyk and
Woodruff (2005) provides algorithms for approximating $\ell_\alpha$-norms for $\alpha > 2$ which essentially
match the theoretical lower bounds on the complexity in Saks and Sun (2002). The strengths
of max-stable sketches lie in approximating max-linear functionals.

One of the key advantages of max-stable sketches is in handling dominance norms. Cormode
and Muthukrishnan (2003a) consider the problem of estimating the norm of the dominant
of several signals, that is, dominance norms. Given non-negative signals $f_r(i)$, $1 \le i \le N$,
$r = 1, \ldots, R$, the goal is to estimate the norm $\|f^*\|_{\ell_\alpha}$, where

$$f^*(i) := \max_{1 \le r \le R} f_r(i), \quad 1 \le i \le N. \tag{1.6}$$

Problems of this type are of interest when monitoring Internet traffic, for example, where
$f_r(i)$ stands for the number of packets transmitted by IP address $i$ in its $r$-th transmission.
The signal $f^*$ then represents the worst case scenario in terms of traffic load on the network and
its norm or various other characteristics are of interest to network administrators. Other
applications of dominance norms arise when studying electric grid loads and in finance (for 
more details, see Cormode and Muthukrishnan (2003a) and the references therein). A novel 
area of applications of max-stable sketches arises in privacy, see Ishai, Malkin, Strauss and 
Wright (2006). 



Suppose that we only have access to the max-stable sketches $E_j(f_r)$, $1 \le j \le K$ of the
signals $f_r$ in (1.6). In view of max-linearity, we then have

$$E_j(f^*) = \max_{1 \le r \le R} E_j(f_r), \quad 1 \le j \le K.$$

That is, one can recover the max-stable sketch of the dominant signal $f^*$ by taking component-wise
maxima of the sketches of the signals $f_r$, $1 \le r \le R$. Therefore, the quantity $\|f^*\|_{\ell_\alpha}$,
which is the dominance norm of the signals $f_r$, $1 \le r \le R$, can be readily estimated from
the sketch $E_j(f^*)$ by using medians or sample moments. Moreover, this can be done with
precision within a factor of $(1 \pm \epsilon)$ and with probability at least $(1 - \delta)$, provided that
$K \ge C \ln(1/\delta)/\epsilon^2$. In practice, one deals with finite precision calculations and pseudorandom
number generators of bounded space. In this setting, as in the case of approximating
plain norms, one can compute dominance norms by using an algorithm with processing space
$O(\log_2(M) \log_2(N/\delta) \ln(1/\delta)/\epsilon^2)$, and $O(\log_2(M) \ln(1/\delta)/\epsilon^2)$ per item processing time. Our
approach consumes less randomness and space, and improves significantly on the processing time
in the method of Cormode and Muthukrishnan (2003a) (see Theorem 2 therein).

In addition to norms and dominance norms, one can use max-stable sketches to recover
large values of the signal exactly. We construct a point estimate $\hat f(i_0)$ of $f(i_0)$, $i_0 \in \{1, \ldots, N\}$,
based on an ideal precision $\alpha$-max-stable sketch of size $K$, such that for any $\epsilon > 0$ and $\delta > 0$,

$$P\{\hat f(i_0) = f(i_0)\} \ge 1 - \delta \quad \text{if} \quad f(i_0) > \epsilon \|f\|_{\ell_\alpha} \ \text{and} \ K \ge \frac{\ln(1/\delta)}{\epsilon^\alpha}. \tag{1.7}$$

Real algorithms of space

$$O(\log_2(DN/\delta\epsilon) \log_2(N/\delta) \log_2(1/\delta)/\epsilon^\alpha) = O((\log_2(DN/\delta\epsilon))^2 \log_2(1/\delta)/\epsilon^\alpha)$$

and smaller order of per stream item processing time exist. Here $D = 1 + f(i_0)/\big(\sum_{i \ne i_0} f(i)^\alpha\big)^{1/\alpha}$
and the signal $f$ is supposed to take integer values. Observe that one can easily maintain the
largest $1/\epsilon^\alpha$ values of the signal exactly. Although our method does not improve on the naive
approach, the proposed estimator may be useful when the signal is not directly observable but 
its max-stable sketch is available. This is particularly useful in applications related to privacy, 
see the forthcoming paper of Ishai, Malkin, Strauss and Wright (2006). 

Important ideas exploiting min-stability have been successfully used in the literature. Cohen
(1997) assigns to the items of a positive signal independent Exponential variables with
parameters equal to the signal values. The minimum of independent Exponential variables is an
Exponential variable with parameter equal to the sum of the parameters of the individual components.
Therefore, by keeping only the minima of such Exponential variables corresponding to certain
ranges of the signal values, one can estimate the sum of the signal values in these ranges. This
can be done efficiently, in small space and time, by taking independent copies of such minima,
see Theorem 2.3 in Cohen (1997). This approach can be viewed as dual to that of
the max-stable sketches. Indeed, the reciprocal of a mean-one Exponential variable is standard
$\alpha$-Fréchet with $\alpha = 1$. We provide here a more general framework where $\alpha$ can be an arbitrary
positive parameter and therefore we can estimate not only sums but $\ell_\alpha$-norms. In fact, going a
step further, we estimate efficiently dominance norms of several signals and show that relatively
large values can be recovered exactly with high probability.



Alon, Duffield, Lund and Thorup (2005) suggest an interesting random priority sampling
scheme. It assigns random priority $q_i = w_i/u_i$ to an item $i$ which has weight $w_i > 0$, where
the $u_i$'s are independent uniformly distributed random numbers in $(0, 1)$. In our scenario $w_i$
corresponds to the signal value $f(i)$. However, instead of taking maxima of the priorities $q_i$
over $i$, these authors consider the top-$k$ largest priority items. By using a statistic, relative
to the $(k + 1)$-st largest priority, they can estimate efficiently and with high probability the
sum of the weights $w_i$ for relatively small $k$'s. This is an interesting approach and it differs
from that of the max-stable sketches in two major aspects: (i) the random priorities $q_i$ have
Pareto distribution with heavy-tail exponent 1, whereas we employ Fréchet distributions to be
able to use their max-stability property; (ii) in priority sampling, one keeps the top-$k$ values,
whereas max-stable sketches keep different realizations of the maximal "priority". The second
difference between max-stable sketches and the priority sampling scheme of Alon, Duffield,
Lund and Thorup (2005) is crucial since the top-$k$ priorities are dependent random variables.
This fact, we believe, makes the rigorous analysis of the variance in the priority sampling
scheme rather difficult (see Conjecture 1 in the last reference). Nevertheless, priority sampling
involves generating only $N$ independent realizations, where $N$ is the size of the signal. It is
thus computationally more efficient than our method and the method of Cohen (1997) (see
also Section 1.5.6 in Alon, Duffield, Lund and Thorup (2005) for a discussion).

In summary, the max-stable sketches proposed here are natural when dealing with dominance
norms and $\ell_\alpha$-norms for arbitrarily large $\alpha > 0$. Their properties, moreover, can be
established precisely and rigorously related to the nature of the signal $f$. Max-stable sketches
complement and improve on existing techniques, and can offer a new "non-linear" dimension
in stochastic approximation algorithms.

2 Approximating $\ell_\alpha$-norms and distances

We show here that max-stable sketches can be used to estimate norms and certain distances 
between two signals. For simplicity, we deal here with ideal precision sketches. In an extended
version of the paper we show that efficient real algorithms exist by using the results of Nisan 
(1990). 

We first focus on estimating the norm $\|f\|_{\ell_\alpha} = \big(\sum_i f(i)^\alpha\big)^{1/\alpha}$ from the $\alpha$-max-stable sketch
$E_j(f)$, $1 \le j \le K$ of $f$ (see (1.4)). Introduce the quantities

$$\ell_{\alpha,r}(f) := \Big(\frac{1}{\Gamma(1 - r/\alpha) K} \sum_{j=1}^{K} E_j(f)^r\Big)^{1/r},$$

for some $0 < r < \alpha$, and

$$\ell_{\alpha,\mathrm{med}}(f) := (\ln 2)^{1/\alpha} \, \mathrm{median}\{E_j(f), \ 1 \le j \le K\},$$

see (4.13) and (4.14) below for motivation. By using the max-stability property of $\alpha$-Fréchet
distributions, we obtain the following result.

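Theorem 2.1 Let $\epsilon > 0$ and $\delta > 0$. Then, for all $0 < r < \alpha/2$, we have

$$P\{|\ell_{\alpha,r}(f)/\|f\|_{\ell_\alpha} - 1| \le \epsilon\} \ge 1 - \delta, \quad \text{and} \quad P\{|\ell_{\alpha,\mathrm{med}}(f)/\|f\|_{\ell_\alpha} - 1| \le \epsilon\} \ge 1 - \delta,$$

provided that $K \ge C \log(1/\delta)/\epsilon^2$. Here the constant $C > 0$ depends only on $\alpha$ and $r$, in the
case of the $\ell_{\alpha,r}(f)$ estimator.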

The idea of the proof is given in the Appendix. It relies on the fact that the $E_j(f)^r$ have finite
variances for all $0 < r < \alpha/2$ and uses the Central Limit Theorem. More general results where
$\alpha/2 \le r < \alpha$ will be given in an extended version of the paper.

The last result indicates that the quantities $\ell_{\alpha,r}(f)$, $0 < r < \alpha/2$ and $\ell_{\alpha,\mathrm{med}}(f)$ approximate
$\|f\|_{\ell_\alpha}$ up to a factor of $(1 \pm \epsilon)$ with probability at least $(1 - \delta)$. This can be achieved with an
ideal-precision sketch of size $K = O(\log(1/\delta)/\epsilon^2)$.
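
The two estimators can be tried out numerically with a small Python sketch. It is meant only as an illustration of the definitions of $\ell_{\alpha,r}(f)$ and $\ell_{\alpha,\mathrm{med}}(f)$ in the ideal-precision setting; the function names `frechet`, `sketch`, `norm_median` and `norm_moment` are hypothetical, and fresh Fréchet variables are drawn for each sketch entry instead of being regenerated from a seed.

```python
import math
import random
import statistics


def frechet(rng, alpha):
    """One standard alpha-Frechet variable by inverting the CDF exp(-x**-alpha)."""
    u = rng.random()
    while u <= 0.0:  # avoid log(1/0)
        u = rng.random()
    return (-math.log(u)) ** (-1.0 / alpha)


def sketch(f, K, alpha, rng):
    """E_j(f) = max_i f(i) Z_j(i), j = 1..K, with independent Z_j(i)."""
    return [max(x * frechet(rng, alpha) for x in f) for _ in range(K)]


def norm_median(E, alpha):
    """Median-based estimator: (ln 2)^(1/alpha) * median{E_j}."""
    return math.log(2.0) ** (1.0 / alpha) * statistics.median(E)


def norm_moment(E, alpha, r):
    """Moment-based estimator; needs 0 < r < alpha (r < alpha/2 for finite variance)."""
    mean_r = sum(e ** r for e in E) / len(E)
    return (mean_r / math.gamma(1.0 - r / alpha)) ** (1.0 / r)


if __name__ == "__main__":
    rng = random.Random(1)
    f = [float(i) for i in range(1, 21)]
    alpha = 2.0
    E = sketch(f, 2000, alpha, rng)
    print("true norm    :", sum(x ** alpha for x in f) ** (1.0 / alpha))
    print("median based :", norm_median(E, alpha))
    print("moment based :", norm_moment(E, alpha, r=0.5))
```

Both estimates should fall within a few percent of the true norm for a sketch of this size; shrinking $K$ makes the spread visibly larger, in line with the $K \ge C \log(1/\delta)/\epsilon^2$ requirement.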

We now focus on approximating distances. Consider two signals $f, g : \{1, \ldots, N\} \to [0, \infty)$
and let $E_j(f)$, $E_j(g)$, $j = 1, \ldots, K$ be their ideal precision max-stable sketches. Observe that
the max-stable sketches are non-linear and therefore even if $f(i) \le g(i)$, $1 \le i \le N$, the
sketch $E_j(g - f)$ does not equal $E_j(g) - E_j(f)$. Nevertheless, one can introduce a distance
between the signals $f$ and $g$, other than the norm $\|f - g\|_{\ell_\alpha}$, which can be computed by using
the sketches $E_j(f)$ and $E_j(g)$.

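Consider the functional

$$\rho_\alpha(f, g) := \|f^\alpha - g^\alpha\|_{\ell_1} = \sum_i |f(i)^\alpha - g(i)^\alpha|.$$

One can verify that $\rho_\alpha(f, g)$ is a metric on $\mathbb{R}_+^N$. This metric, rather than the norm $\|f - g\|_{\ell_\alpha}$,
is more natural when dealing with max-stable sketches (see Stoev and Taqqu (2005)).
Observe that

$$\|f^\alpha - g^\alpha\|_{\ell_1} = \sum_i \big(f(i)^\alpha \vee g(i)^\alpha - f(i)^\alpha\big) + \sum_i \big(f(i)^\alpha \vee g(i)^\alpha - g(i)^\alpha\big)
= 2\|f \vee g\|_{\ell_\alpha}^\alpha - \|f\|_{\ell_\alpha}^\alpha - \|g\|_{\ell_\alpha}^\alpha.$$

By the max-linearity of max-stable sketches, we get $E_j(f \vee g) = E_j(f) \vee E_j(g)$ (see (4.12)
below). Therefore, the terms in the last expression can be estimated in terms of the estimators
$\ell_{\alpha,r}(f)$ and $\ell_{\alpha,\mathrm{med}}(f)$ above. Namely, we define

$$\hat\rho_{\alpha,r}(f, g) := 2\ell_{\alpha,r}(f \vee g)^\alpha - \ell_{\alpha,r}(f)^\alpha - \ell_{\alpha,r}(g)^\alpha,$$

for some $0 < r < \alpha$, and

$$\hat\rho_{\alpha,\mathrm{med}}(f, g) := 2\ell_{\alpha,\mathrm{med}}(f \vee g)^\alpha - \ell_{\alpha,\mathrm{med}}(f)^\alpha - \ell_{\alpha,\mathrm{med}}(g)^\alpha.$$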

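Theorem 2.2 Let $\epsilon > 0$, $\delta > 0$ and $\eta > 0$. If

$$\rho_\alpha(f, g) \ge \eta \|f \vee g\|_{\ell_\alpha}^\alpha, \tag{2.1}$$

then, for all $0 < r < \alpha/2$, we have

$$P\{|\hat\rho(f, g)/\rho_\alpha(f, g) - 1| \le O(\epsilon/\eta)\} \ge 1 - 3\delta, \tag{2.2}$$

provided that $K \ge C \ln(1/\delta)/\epsilon^2$. Here the constant $C > 0$ depends only on $\alpha$ and $r$, in the case
of the $\hat\rho_{\alpha,r}(f, g)$ estimator. Here $\hat\rho(f, g)$ stands for either $\hat\rho_{\alpha,\mathrm{med}}(f, g)$ or $\hat\rho_{\alpha,r}(f, g)$.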



The idea of the proof is given in the Appendix. The condition (2.1) is essential. Indeed, by
taking indicator signals $f(i) = 1_A(i)$ and $g(i) = 1_B(i)$, we get that

$$\rho_\alpha(f, g) = |A \,\triangle\, B| = |A \cup B| - |A \cap B| \quad \text{and} \quad \|f \vee g\|_{\ell_\alpha}^\alpha = |A \cup B|.$$

Therefore, if a condition of type (2.1) were not present, one would be able to efficiently estimate
the intersection of the two sets $A$ and $B$ with small relative error, which is proved to be a hard
problem (see Razborov (1992) and also Section 4 in Cormode and Muthukrishnan (2003a)).
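
The median-based distance estimator of this section can be sketched in Python as follows. This is an illustrative, ideal-precision sketch: the helper names (`frechet_matrix`, `sketch`, `norm_median`, `rho_median`) are hypothetical, and the key assumption is that both signals are sketched with the same array of variables $Z_j(i)$, since max-linearity only holds in that case.

```python
import math
import random
import statistics


def frechet_matrix(K, N, alpha, rng):
    """K x N independent standard alpha-Frechet variables Z_j(i)."""
    def draw():
        u = rng.random()
        while u <= 0.0:  # avoid log(1/0)
            u = rng.random()
        return (-math.log(u)) ** (-1.0 / alpha)
    return [[draw() for _ in range(N)] for _ in range(K)]


def sketch(f, Z):
    """E_j(f) = max_i f(i) Z_j(i) for every row j of the shared matrix Z."""
    return [max(x * z for x, z in zip(f, row)) for row in Z]


def norm_median(E, alpha):
    return math.log(2.0) ** (1.0 / alpha) * statistics.median(E)


def rho_median(Ef, Eg, alpha):
    """Estimate 2||f v g||^alpha - ||f||^alpha - ||g||^alpha from the two sketches."""
    Efg = [max(a, b) for a, b in zip(Ef, Eg)]  # sketch of f v g by max-linearity
    return (2.0 * norm_median(Efg, alpha) ** alpha
            - norm_median(Ef, alpha) ** alpha
            - norm_median(Eg, alpha) ** alpha)


if __name__ == "__main__":
    rng = random.Random(7)
    alpha = 1.0
    f = [10.0] * 10 + [0.0] * 10
    g = [0.0] * 10 + [10.0] * 10
    Z = frechet_matrix(5000, 20, alpha, rng)
    print("true distance:", sum(abs(a ** alpha - b ** alpha) for a, b in zip(f, g)))
    print("estimate     :", rho_median(sketch(f, Z), sketch(g, Z), alpha))
```

In this example the two signals have disjoint supports, so the distance equals $\|f \vee g\|_{\ell_\alpha}^\alpha$ and condition (2.1) holds with $\eta = 1$. When the signals nearly coincide, the estimate is a small difference of large noisy terms, which is the regime that condition (2.1) rules out.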

3 Approximating dominance norms 

Let now $f_r(i)$, $1 \le i \le N$, $r = 1, \ldots, R$ be $R$ non-negative signals defined over a large domain.
Our goal is to approximate their dominance $\ell_\alpha$-norm, that is, the norm $\|f^*\|_{\ell_\alpha}$, $\alpha > 0$, of the
signal

$$f^*(i) := \max_{1 \le r \le R} f_r(i), \quad 1 \le i \le N.$$

As argued in the introduction, such problems arise in Internet traffic monitoring, electric
grid management and also in finance. The seminal paper of Cormode and Muthukrishnan
(2003a) addresses the problem of dominance norm estimation in the special case $\alpha = 1$. It was
shown therein that the problem has a small space and time approximate solution, valid with
high probability. The main tool used in the last work is $p$-(sum) stable sketches of the
data where the stability index $p > 0$ is taken to have very small values. Here, we propose an
alternative solution to the dominance norm problem, which is superior in terms of time and
space consumption and also works when dealing with $\| \cdot \|_{\ell_\alpha}$ for an arbitrary $\alpha > 0$. At the end
of this section, we also show the connection between our approach and that of Cormode and
Muthukrishnan (2003a).

Let $E_j(f_r)$, $1 \le j \le K$ be the $\alpha$-max-stable sketches of the signals $f_r$, $r = 1, \ldots, R$. By
max-linearity of the max-stable sketch:

$$E_j(f^*) = \max_{1 \le r \le R} E_j(f_r), \quad \forall j, \tag{3.1}$$

where $f^*(i) = \max_{1 \le r \le R} f_r(i)$, $1 \le i \le N$.

Hence, by using sample medians for example, we get

$$\mathrm{dom}_{\max,\alpha}(f_1, \ldots, f_R) := \|f^*\|_{\ell_\alpha} \approx \ell_{\alpha,\mathrm{med}}(f^*)
= (\ln 2)^{1/\alpha} \, \mathrm{median}\Big\{\bigvee_{1 \le r \le R} E_j(f_r), \ 1 \le j \le K\Big\}.$$

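Our first results on $\ell_\alpha$-norm approximation imply:

Theorem 3.1 Let $\epsilon > 0$ and $\delta > 0$. For all $0 < r < \alpha/2$:

$$P\{|\ell_{\alpha,r}(f^*)/\|f^*\|_{\ell_\alpha} - 1| \le \epsilon\} \ge 1 - \delta, \quad \text{and} \quad P\{|\ell_{\alpha,\mathrm{med}}(f^*)/\|f^*\|_{\ell_\alpha} - 1| \le \epsilon\} \ge 1 - \delta,$$

provided

$$K \ge C \log(1/\delta)/\epsilon^2.$$

Here the constant $C > 0$ depends only on $\alpha$ and $r$, in the case of the $\ell_{\alpha,r}(f^*)$ estimator.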
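
As a hedged illustration of relation (3.1) and of the median-based dominance norm estimate, the following Python sketch combines the sketches of several signals by component-wise maxima. The helper names are hypothetical, the setting is ideal precision, and all signals are assumed to be sketched with the same variables $Z_j(i)$.

```python
import math
import random
import statistics


def frechet_matrix(K, N, alpha, rng):
    """K x N independent standard alpha-Frechet variables Z_j(i)."""
    def draw():
        u = rng.random()
        while u <= 0.0:  # avoid log(1/0)
            u = rng.random()
        return (-math.log(u)) ** (-1.0 / alpha)
    return [[draw() for _ in range(N)] for _ in range(K)]


def sketch(f, Z):
    """E_j(f) = max_i f(i) Z_j(i) for every row j of the shared matrix Z."""
    return [max(x * z for x, z in zip(f, row)) for row in Z]


def dominance_sketch(sketches):
    """Component-wise maxima of the sketches E_j(f_r), as in (3.1)."""
    return [max(column) for column in zip(*sketches)]


def dominance_norm(sketches, alpha):
    """Median-based estimate of the dominance norm of the underlying signals."""
    E_star = dominance_sketch(sketches)
    return math.log(2.0) ** (1.0 / alpha) * statistics.median(E_star)


if __name__ == "__main__":
    rng = random.Random(3)
    alpha, K, N = 2.0, 2000, 30
    signals = [[float(rng.randint(0, 50)) for _ in range(N)] for _ in range(4)]
    Z = frechet_matrix(K, N, alpha, rng)
    f_star = [max(column) for column in zip(*signals)]
    print("true dominance norm:", sum(x ** alpha for x in f_star) ** (1.0 / alpha))
    print("estimate           :", dominance_norm([sketch(f, Z) for f in signals], alpha))
```

Only the $R$ sketches of length $K$ are needed at estimation time; the individual signals and the dominant signal $f^*$ never have to be stored together.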



The proof of this result follows from the max-linearity of the max-stable sketches (see (3.1))
and Theorem 2.1 above. We now make some remarks on the differences between our approach
and that of Cormode and Muthukrishnan (2003a).

Remarks: 

• Note that the statement of Lemma 1 in the last reference is mathematically incorrect.
One cannot have $\alpha$ in the right-hand side since the limit has been taken as $\alpha \to 0+$ in the
left-hand side therein. The correct statement is as follows:

Let $\xi_\alpha$ be symmetric $\alpha$-stable random variables with constant scale coefficients $\sigma > 0$.
Then, as $\alpha \to 0+$, we have

$$|\xi_\alpha|^\alpha \stackrel{d}{\longrightarrow} Z,$$

where $Z$ is a standard 1-Fréchet random variable, that is, $P\{Z \le x\} = \exp\{-1/x\}$, $x > 0$.
Observe that $Z \stackrel{d}{=} 1/X$, where $X$ is an Exponential random variable with mean 1. See
Exercise 1.29, p. 54 in Samorodnitsky and Taqqu (1994).

Therefore, the continuity of the cumulative distribution function of the limit $Z$ implies
that the medians of the distributions of $|\xi_\alpha|^\alpha$ converge, as $\alpha \to 0+$, to the median of
$Z$, which is $1/\ln(2)$ (see (4.14) below). (Note that here we use $\alpha$ as in Cormode and
Muthukrishnan (2003a) whereas the parameter $\alpha$ plays a different role in Relation (4.14)
below.)

• The method of Cormode and Muthukrishnan (2003a) uses $p$-(sum) stable sketches with
very small $p > 0$. The $p$-stable distributions involved in these sketches have infinite
moments of all orders greater than $p$ and in practice take extremely large values. This
poses a number of practical challenges in storing and in fact precisely generating these
random sketches. Our method does involve heavy-tailed random variables but they are
not as extremely heavy-tailed and have good computational properties. Furthermore,
Fréchet distributions can be simulated more efficiently than sum-stable distributions
(see the Appendix). Therefore, in practice we expect our method to be more robust than
the one of Cormode and Muthukrishnan (2003a).

• The storage and per item processing times of our method are significantly less than those
of Cormode and Muthukrishnan (2003a).



4 Answering point queries with max-stable sketches

Max-stable sketches can be also used to recover relatively large values of the signal exactly with 
high probability. This problem has in fact a deterministic solution by using a naive algorithm 
in small space and time. Namely, as the signal is being observed (in the aggregate or time 
series model) we simply maintain a vector of the top-$K$ largest values. Max-stable sketches,
however, can be very helpful if no direct access to the signal is available either due to security, 
computational, power or privacy restrictions (see Ishai, Malkin, Strauss and Wright (2006)). 

We first present the main ideas using ideal algorithms which assume infinite precision 
and random variables which can be perfectly generated. We then remove these idealizations. 



Consider a non-negative signal $f : \{1, \ldots, N\} \to [0, \infty)$. Let $\alpha > 0$ and let $E_j(f)$, $j = 1, \ldots, K$
be an ideal $\alpha$-max-stable sketch of $f$ defined in (1.4). Given an $i_0 \in \{1, \ldots, N\}$, set

$$g_j(i_0) := \frac{E_j(f)}{Z_j(i_0)} = \frac{\max_{1 \le i \le N} f(i) Z_j(i)}{Z_j(i_0)}, \quad j = 1, \ldots, K, \tag{4.1}$$

and define the point query estimate $\hat f(i_0)$ as the smallest of the $g_j(i_0)$, $j = 1, \ldots, K$. If
$g_{(j)}(i_0)$, $j = 1, \ldots, K$ denote the sorted $g_j(i_0)$'s:

$$g_{(1)}(i_0) \le g_{(2)}(i_0) \le \cdots \le g_{(K)}(i_0),$$

then the point query estimate $\hat f(i_0)$ is

$$\hat f(i_0) := g_{(1)}(i_0) = \min_{1 \le j \le K} g_j(i_0). \tag{4.2}$$

We also introduce the following criterion:

$$\mathrm{criterion}(i_0) := \begin{cases} 1, & \text{if } g_{(1)}(i_0) = g_{(2)}(i_0), \\ 0, & \text{if } g_{(1)}(i_0) < g_{(2)}(i_0), \end{cases}$$

which serves as a proxy for $\hat f(i_0) = f(i_0)$, as indicated in the following theorem.

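Theorem 4.1 Let $\epsilon \in (0, 1)$, $\delta > 0$ and $i_0 \in \{1, \ldots, N\}$.

(i) If $f(i_0) > \epsilon \|f\|_{\ell_\alpha}$ and $K \ge \ln(1/\delta)/\epsilon^\alpha$ (see (1.7)), then

$$P\{\hat f(i_0) = f(i_0)\} \ge 1 - \delta. \tag{4.3}$$

(ii) (a) $\mathrm{criterion}(i_0) = 1$ implies $\hat f(i_0) = f(i_0)$.
(ii) (b) If for some $\theta > 0$,

$$f(i_0) > \epsilon \|f\|_{\ell_\alpha} \quad \text{and} \quad K \ge \max\{3, \ 2C_\theta \ln(2/\delta)/\epsilon^{\alpha + \theta}\},$$

where $C_\theta = O(1/\theta^{1 + \theta/\alpha})$ is given in (4.22), then

$$P\{\mathrm{criterion}(i_0) = 1\} \ge 1 - \delta. \tag{4.4}$$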
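
The point query estimate (4.2) and its criterion can be sketched in Python as below. This is an illustration only, with hypothetical names. One deviation from the ideal setting is labeled explicitly: in floating point the ratio $f(i_0) Z_j(i_0)/Z_j(i_0)$ may differ from $f(i_0)$ in the last bit, so the exact equality $g_{(1)}(i_0) = g_{(2)}(i_0)$ is replaced by a comparison up to a small relative tolerance.

```python
import math
import random


def frechet_matrix(K, N, alpha, rng):
    """K x N independent standard alpha-Frechet variables Z_j(i)."""
    def draw():
        u = rng.random()
        while u <= 0.0:  # avoid log(1/0)
            u = rng.random()
        return (-math.log(u)) ** (-1.0 / alpha)
    return [[draw() for _ in range(N)] for _ in range(K)]


def sketch(f, Z):
    """E_j(f) = max_i f(i) Z_j(i) for every row j of Z."""
    return [max(x * z for x, z in zip(f, row)) for row in Z]


def point_query(E, z_i0, rel_tol=1e-9):
    """Return (estimate, criterion) from the sketch E and the values Z_j(i0).

    Needs K >= 2. The tolerance rel_tol is an assumption of this illustration,
    standing in for exact equality in the ideal-precision definition.
    """
    g = sorted(e / z for e, z in zip(E, z_i0))  # g_(1) <= ... <= g_(K)
    criterion = 1 if math.isclose(g[0], g[1], rel_tol=rel_tol) else 0
    return g[0], criterion


if __name__ == "__main__":
    rng = random.Random(11)
    alpha, K, N, i0 = 2.0, 20, 50, 7
    f = [1.0] * N
    f[i0] = 100.0  # one dominant value
    Z = frechet_matrix(K, N, alpha, rng)
    estimate, criterion = point_query(sketch(f, Z), [row[i0] for row in Z])
    print(estimate, criterion)
```

Every $g_j(i_0)$ is at least $f(i_0)$, with equality exactly when index $i_0$ attains the maximum in $E_j(f)$; two coinciding smallest values therefore indicate that $i_0$ attained the maximum in at least two independent sketch components.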

We now address the algorithmic implementation of the point query problem and its criterion.
This is more involved than in the case of norms and therefore we present a detailed
argument here. Following Indyk (2000), suppose now that the signal is only of finite precision,
e.g.

$$f : \{1, \ldots, N\} \to \{0, 1, \ldots, L\},$$

and suppose, moreover, that our pseudorandom numbers can only take values in the set
$V_L := \{p/q : p, q \in \{0, 1, \ldots, L\}, \ q \ne 0\}$. Let $U_j(i)$ be infinite precision independent uniform
random numbers in $(0, 1)$. We shall base our algorithms on discretized versions of the ideal
standard $\alpha$-Fréchet variables $Z_j(i) := \Phi_\alpha^{-1}(U_j(i))$, where $\Phi_\alpha^{-1}(y) := (\ln(1/y))^{-1/\alpha}$, $y \in (0, 1)$,
is the inverse cumulative distribution function of a standard $\alpha$-Fréchet variable. Fix a small
parameter $\gamma > 0$ (to be specified), and suppose that

$$U_j(i) \in [\gamma, 1 - \gamma], \quad \text{for all } i = 1, \ldots, N, \ j = 1, \ldots, K. \tag{4.5}$$

This is not a limitation since this happens with probability at least $(1 - 2\gamma)^{KN}$, which is at
least $(1 - \delta)$ provided that $\gamma \le \delta/(4KN) = O(\delta/KN)$. This is so because $|\ln(1 - 2\gamma)| \le 4\gamma$,
$\forall \gamma \in (0, 1/4)$, and since $|\ln(1 - \delta)| \ge \delta$, $\forall \delta \in (0, 1)$.

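Let now $\tilde U_j(i)$ be a multiple of $1/L$, nearest to $U_j(i)$, which is also in the set $[\gamma, 1 - \gamma] \cap V_L$.
Let $Z_j^*(i) := \Phi_\alpha^{-1}(\tilde U_j(i))$ and let $\tilde Z_j(i)$ be a multiple of $1/L$ in the set $V_L$, nearest to $Z_j^*(i)$.
Observe that $|\tilde Z_j(i) - Z_j^*(i)| \le 1/L$ and, as in Indyk (2000), by the mean value theorem,

$$|Z_j(i) - Z_j^*(i)| \le \frac{1}{L} \sup_{y \in [\gamma, 1 - \gamma]} \Big|\frac{d\Phi_\alpha^{-1}(y)}{dy}\Big| = O\Big(\frac{1}{L \gamma^{1 + 1/\alpha}}\Big), \tag{4.6}$$

and therefore,

$$|\tilde Z_j(i) - Z_j(i)| \le \beta := O(1/(L \gamma^{1 + 1/\alpha})), \quad \text{for all } i = 1, \ldots, N, \ j = 1, \ldots, K. \tag{4.7}$$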

Theorem 4.2 Let $\epsilon \in (0, (\alpha/(\alpha + 1))^{1/\alpha^2})$, $\delta \in (0, 1)$ and $D > 0$. Suppose that

$$2(1 + C_\alpha) \epsilon \Big(\sum_{1 \le i \le N} f(i)^\alpha\Big)^{1/\alpha} \le f(i_0) \le D \Big(\sum_{i \ne i_0} f(i)^\alpha\Big)^{1/\alpha}, \tag{4.8}$$

where $C_\alpha = \alpha (1 + 1/\alpha)^{1 + 1/\alpha} e^{-(1 + 1/\alpha)}$.

If the precision $\beta$ in (4.7) is such that $\beta \le \epsilon^\alpha/(D + 1)$, then there exists an algorithm,
implementing the point estimator $\hat f(i_0)$, so that

$$P\{\hat f(i_0) = f(i_0)\} \ge 1 - 3\delta \tag{4.9}$$

holds. This can be done in space $O(\log_2(DN/\epsilon\delta) \log_2(N) \ln(1/\delta)/\epsilon^\alpha)$ with the same order of
bit-wise operations per stream item.

The proof is given in the Appendix. 

The infinite precision was essential in proving that $\{\mathrm{criterion}(i_0) = 1\}$ implies $\{\hat f(i_0) =
f(i_0)\}$. We cannot expect this when using real algorithms where the $\tilde Z_j(i)$'s have finite precision.
The following result shows, however, that there is nevertheless an algorithm such that
$\{\mathrm{criterion}(i_0) = 1\}$ implies that $\{\hat f(i_0) = f(i_0)\}$ holds with high probability, independently of
the nature of the signal $f$.

Theorem 4.3 Let the point estimator $\hat f(i_0)$ and its criterion be based on a max-stable sketch
in terms of the finite precision variables $\tilde Z_j(i)$ as in (4.7). If $\beta \le C\big(\delta/(K^2 [\ln(NK^2/\delta)]^{1/\alpha})\big)$,
then

$$P(\{\hat f(i_0) \ne f(i_0)\} \cap \{\mathrm{criterion}(i_0) = 1\}) \le \delta, \tag{4.10}$$

where the constant $C$ does not depend on the signal $f$. The last probability bound is also
valid for an algorithm requiring storage $O((\log_2(N) \ln(1/\delta)/\epsilon^\alpha)^{O(1)})$ and the same order of
operations per stream item.
The proof is given in the Appendix.
Remarks: 

1. Relation (4.10) shows that our criterion may falsely indicate that $\hat f(i_0) = f(i_0)$ only with
small probability.

2. Our point query and its criterion algorithms have features of both Las Vegas and Monte
Carlo randomized algorithms. Namely, like Las Vegas algorithms, they give exact results;
however, their computational time is fixed and sometimes (with low probability) they
fail to give correct results. As in Monte Carlo algorithms, the probability of getting exact
results grows with the size of the max-stable sketch.

Acknowledgments 

We would like to thank Anna Gilbert, Martin Strauss and Joel Tropp for their remarks and 
encouragement. Martin Strauss helped us understand better the results of Nisan (1990). 

Appendix 

Background on max-stable distributions

Definition 4.1 A random variable $Z$ is said to be max-stable if, for any $a, b > 0$, there exist
$c > 0$ and $d \in \mathbb{R}$, such that

$$\max\{aZ', bZ''\} \stackrel{d}{=} cZ + d, \tag{4.11}$$

where $Z'$ and $Z''$ are independent copies of $Z$.

This definition resembles the definition of sum-stability, with the operation "max" in place of
summation. Recall that $X$ is sum-stable if for any $a, b > 0$, there exist $c > 0$ and $d \in \mathbb{R}$, such
that

$$aX' + bX'' \stackrel{d}{=} cX + d,$$

where $X'$ and $X''$ are independent copies of $X$.

Both sum-stable and max-stable distributions arise as the limit distributions when taking sums 
and maxima, respectively, of independent and identically distributed (iid) random variables. 
For more details on sum-stable and max-stable variables, see e.g. Samorodnitsky and Taqqu 
(1994) and Resnick (1987). We will only review in more detail the class of $\alpha$-Fréchet
max-stable variables.

Definition 4.2 A random variable $Z$ is said to be $\alpha$-Fréchet, for some $\alpha > 0$, if

$$P\{Z \le x\} = \exp\{-\sigma^\alpha x^{-\alpha}\}, \quad \text{for } x > 0,$$

and zero otherwise (for $x \le 0$), where $\sigma > 0$. If $\sigma = 1$, then $Z$ is said to be standard $\alpha$-Fréchet.

We now list some key features of the $\alpha$-Fréchet variables.

• The parameter $\sigma$ plays the role of a scale coefficient. Indeed, for all $a > 0$,

$$P\{aZ \le x\} = P\{Z \le x/a\} = \exp\{-(a\sigma)^\alpha x^{-\alpha}\}, \quad x > 0,$$

and therefore $aZ$ is $\alpha$-Fréchet with scale coefficient $a\sigma$.

• One can check by using independence that (4.11) holds for any $\alpha$-Fréchet $Z$. More generally,
let $Z(1), \ldots, Z(n)$ be iid $\alpha$-Fréchet with scale coefficients $\sigma > 0$, and let $f(i) \ge 0$. Then,
by independence, for any $x > 0$,

$$P\Big\{\bigvee_{1 \le i \le n} f(i) Z(i) \le x\Big\} = \prod_{1 \le i \le n} P\{Z(i) \le x/f(i)\} = \exp\Big\{-\sum_{i=1}^{n} f(i)^\alpha \sigma^\alpha x^{-\alpha}\Big\}.$$

In particular, when $\sigma = 1$,

$$\xi := \bigvee_{1 \le i \le n} f(i) Z(i) \stackrel{d}{=} \sigma_\xi Z, \quad \text{where} \quad \sigma_\xi = \Big(\sum_i f(i)^\alpha\Big)^{1/\alpha} = \|f\|_{\ell_\alpha},$$

and where $Z$ is a standard $\alpha$-Fréchet variable. That is, the weighted maximum $\xi$ is an $\alpha$-Fréchet
variable with scale coefficient $\sigma_\xi$ equal to $\|f\|_{\ell_\alpha}$.

This last property is one motivation to consider max-stable sketches. The max-stable
sketch defined in (1.4) can be viewed as a collection of independent realizations of an $\alpha$-Fréchet
variable with scale coefficient equal to $\|f\|_{\ell_\alpha}$.

• The max-stable sketches are max-linear. That is, if $f, g \in \mathbb{R}_+^N$ are two signals, then for any
$a, b \ge 0$, we have:

$$E_j(af \vee bg) = aE_j(f) \vee bE_j(g), \quad \text{for all } j = 1, \ldots, K. \tag{4.12}$$

Indeed,

$$E_j(af \vee bg) = \bigvee_{1 \le i \le N} (af(i) \vee bg(i)) Z_j(i) = a \bigvee_{1 \le i \le N} f(i) Z_j(i) \vee b \bigvee_{1 \le i \le N} g(i) Z_j(i) = aE_j(f) \vee bE_j(g).$$

• The $\alpha$-Fréchet variables are heavy-tailed. Namely, by using the Taylor series expansion of
$1 - e^{-z}$, one can show that

$$P\{Z > x\} \sim \sigma^\alpha x^{-\alpha}, \quad \text{as } x \to \infty,$$

where $a_n \sim b_n$ means $a_n/b_n \to 1$, $n \to \infty$. Thus, the moments $EZ^p$, $p > 0$ of $Z$ are finite only
if $0 < p < \alpha$. However, when $0 < p < \alpha$, these moments can be easily evaluated. We have

$$EZ^p = \int_0^\infty z^p \, d\exp\{-\sigma^\alpha z^{-\alpha}\} = \sigma^p \int_0^\infty u^{-p/\alpha} e^{-u} \, du = \sigma^p \Gamma(1 - p/\alpha), \tag{4.13}$$

where in the last integral we used the change of variables $u = \sigma^\alpha z^{-\alpha}$ and where $\Gamma(a) :=
\int_0^\infty u^{a - 1} e^{-u} \, du$, $a > 0$ denotes the Gamma function.
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• One can also easily express the median med(Z) of an a— Frechet variable Z. Indeed, ¥{Z < 
med(Z)} = 1/2 and by solving exp{— a a med(Z)~ a } = 1/2, one obtains: 

med(Z) = - ?—. . (4.14) 

v ' (ln2)V« v ; 

In Section [21 we used Relations (J4.13P and (|4.14p to estimate norms and distances of signals 
based on their max-stable sketches. 

• The $\alpha$-Fréchet variables can be easily simulated. If $U_j$, $j \in \mathbb{N}$, are independent uniformly distributed variables in $(0,1)$, then $Z_j := \Phi_\alpha^{-1}(U_j) = (\ln(1/U_j))^{-1/\alpha}$, $j \in \mathbb{N}$, are independent standard $\alpha$-Fréchet. Indeed, for all $x > 0$,

$$\mathbb{P}\{(\ln(1/U))^{-1/\alpha} \le x\} = \mathbb{P}\{\ln(1/U) \ge x^{-\alpha}\} = \mathbb{P}\{U \le e^{-x^{-\alpha}}\} = e^{-x^{-\alpha}}.$$
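
The inversion recipe of the last item, together with Relations (4.13) and (4.14), can be checked numerically. The following is a minimal Python sketch and not part of the original development: the function names (`frechet`, `frechet_median`, `frechet_moment`), the seed, the sample size and the parameter values are illustrative assumptions.

```python
import math
import random
import statistics

def frechet(rng, alpha, sigma=1.0):
    """Draw one alpha-Frechet variable with scale sigma by inverting its cdf."""
    u = rng.random()
    while u == 0.0:                  # random() lies in [0, 1); avoid log(1/0)
        u = rng.random()
    return sigma * math.log(1.0 / u) ** (-1.0 / alpha)

def frechet_median(alpha, sigma=1.0):
    """Closed-form median, Relation (4.14)."""
    return sigma / math.log(2.0) ** (1.0 / alpha)

def frechet_moment(alpha, p, sigma=1.0):
    """Closed-form p-th moment, Relation (4.13); requires 0 < p < alpha."""
    return sigma ** p * math.gamma(1.0 - p / alpha)

rng = random.Random(0)               # arbitrary fixed seed (assumption)
alpha, sigma, n = 2.0, 3.0, 200_000  # illustrative values (assumption)
sample = [frechet(rng, alpha, sigma) for _ in range(n)]

emp_median = statistics.median(sample)
emp_moment = sum(z ** 0.5 for z in sample) / n   # p = 0.5 < alpha / 2
print("median:", emp_median, "vs", frechet_median(alpha, sigma))
print("moment:", emp_moment, "vs", frechet_moment(alpha, 0.5, sigma))
```

The moment order $p = 0.5$ is taken below $\alpha/2$ so that $Z^p$ has a finite variance and the empirical mean settles quickly.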

Proofs for Sections [5] and 

Proof of Theorem 2.1 Observe that

$$\frac{\ell_{\alpha,r}(f)^r}{\|f\|_{\ell_\alpha}^r} \stackrel{d}{=} \frac{1}{\Gamma(1-r/\alpha)K}\sum_{j=1}^{K} \xi_j^r \quad\text{and}\quad \frac{\ell_{\alpha,\mathrm{med}}(f)}{\|f\|_{\ell_\alpha}} \stackrel{d}{=} (\ln 2)^{1/\alpha}\,\mathrm{median}\{\xi_j,\ 1\le j\le K\},$$

where $\xi_j$, $j = 1, \ldots, K$, are independent standard $\alpha$-Fréchet variables.

Therefore, the result in the case of the sample-median based estimator follows from Lemma 2 in Indyk (2000), for example, since the derivative of $\Phi_\alpha^{-1}(y) = (\ln(1/y))^{-1/\alpha}$ at $y = 1/2$ is bounded. The result in the case of the moment estimator follows from the Central Limit Theorem, since the variables $\xi_j^r/\Gamma(1 - r/\alpha) - 1$ have zero expectations and finite variances. □

We will provide more detailed bounds in the above proof with absolute constants in an extended 
version of this paper. 
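
The two distributional identities used in the proof of Theorem 2.1 suggest how the norm is read off a sketch. The following is a minimal Python sketch under the assumption that the estimators have the form implied by those identities, namely $(\ln 2)^{1/\alpha}\,\mathrm{median}_j E_j(f)$ and $\big(\sum_j E_j(f)^r/(\Gamma(1-r/\alpha)K)\big)^{1/r}$. All function names, the toy signal, the seed and the values of $K$ and $r$ are hypothetical choices made for illustration.

```python
import math
import random
import statistics

def std_frechet(rng, alpha):
    """One standard alpha-Frechet draw via Z = (ln(1/U))^(-1/alpha)."""
    u = rng.random()
    while u == 0.0:                      # guard against log(1/0)
        u = rng.random()
    return math.log(1.0 / u) ** (-1.0 / alpha)

def max_stable_sketch(f, K, alpha, rng):
    """Ideal-precision sketch E_j(f) = max_i f(i) Z_j(i), j = 1..K."""
    return [max(fi * std_frechet(rng, alpha) for fi in f) for _ in range(K)]

def norm_from_median(sketch, alpha):
    """Median-based estimate (assumed form): (ln 2)^(1/alpha) * median_j E_j(f)."""
    return math.log(2.0) ** (1.0 / alpha) * statistics.median(sketch)

def norm_from_moment(sketch, alpha, r):
    """Moment-based estimate (assumed form); needs 0 < r < alpha."""
    K = len(sketch)
    scaled_mean = sum(e ** r for e in sketch) / (math.gamma(1.0 - r / alpha) * K)
    return scaled_mean ** (1.0 / r)

rng = random.Random(1)                   # arbitrary fixed seed (assumption)
alpha, K = 2.0, 2000                     # illustrative values (assumption)
f = [float(i % 7) for i in range(50)]    # toy non-negative signal
true_norm = sum(v ** alpha for v in f) ** (1.0 / alpha)

sketch = max_stable_sketch(f, K, alpha, rng)
est_med = norm_from_median(sketch, alpha)
est_mom = norm_from_moment(sketch, alpha, r=0.5)
print(true_norm, est_med, est_mom)
```

The choice $r = 0.5 < \alpha/2$ keeps the variance of $\xi_j^r$ finite, which is what the Central Limit Theorem argument above relies on.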

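Proof of Theorem 2.2 We consider only $\widehat\rho(f,g) = \rho_{\alpha,r}(f,g)$. The argument for the estimator $\rho_{\alpha,\mathrm{med}}(f,g)$ is similar. Suppose that $K$ is as in Theorem 2.1 so that, with probability at least $(1-\delta)$ each, we have $\ell_{\alpha,r}(f)^{\alpha} = \|f\|_{\ell_\alpha}^{\alpha}(1 + O(\epsilon))$, $\ell_{\alpha,r}(g)^{\alpha} = \|g\|_{\ell_\alpha}^{\alpha}(1 + O(\epsilon))$, and $\ell_{\alpha,r}(f \vee g)^{\alpha} = \|f \vee g\|_{\ell_\alpha}^{\alpha}(1 + O(\epsilon))$.

Thus, by the union bound, with probability at least $(1 - 3\delta)$ we have

$$\rho_{\alpha,r}(f,g) = 2\|f \vee g\|_{\ell_\alpha}^{\alpha}(1 + O(\epsilon)) - \|f\|_{\ell_\alpha}^{\alpha}(1 + O(\epsilon)) - \|g\|_{\ell_\alpha}^{\alpha}(1 + O(\epsilon)). \tag{4.15}$$

Now, since

$$\rho_{\alpha}(f,g) = 2\|f \vee g\|_{\ell_\alpha}^{\alpha} - \|f\|_{\ell_\alpha}^{\alpha} - \|g\|_{\ell_\alpha}^{\alpha} \ge \eta\|f \vee g\|_{\ell_\alpha}^{\alpha} \ge \eta\max\{\|f\|_{\ell_\alpha}^{\alpha}, \|g\|_{\ell_\alpha}^{\alpha}\},$$

we get, from (4.15), that

$$\rho_{\alpha,r}(f,g) = \big(2\|f \vee g\|_{\ell_\alpha}^{\alpha} - \|f\|_{\ell_\alpha}^{\alpha} - \|g\|_{\ell_\alpha}^{\alpha}\big)(1 + O(\epsilon/\eta)).$$

The last relation is valid with probability at least $(1 - 3\delta)$, which implies (2.2). □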
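
To illustrate the argument, note that $2(a \vee b) - a - b = |a - b|$, so that $\rho_\alpha(f,g) = \sum_i |f(i)^{\alpha} - g(i)^{\alpha}|$. The following minimal Python sketch estimates this quantity from sketches of $f$ and $g$ built with the same variables $Z_j(i)$, using max-linearity (4.12) to obtain the sketch of $f \vee g$. It assumes the median-based norm estimator $(\ln 2)^{1/\alpha}\,\mathrm{median}_j E_j(\cdot)$; the function names, the two toy signals, the seed and the value of $K$ are hypothetical.

```python
import math
import random
import statistics

def std_frechet(rng, alpha):
    """One standard alpha-Frechet draw via Z = (ln(1/U))^(-1/alpha)."""
    u = rng.random()
    while u == 0.0:                      # guard against log(1/0)
        u = rng.random()
    return math.log(1.0 / u) ** (-1.0 / alpha)

def paired_sketches(f, g, K, alpha, rng):
    """Sketches of f and g that share the SAME variables Z_j(i)."""
    ef, eg = [], []
    for _ in range(K):
        z = [std_frechet(rng, alpha) for _ in f]
        ef.append(max(a * zi for a, zi in zip(f, z)))
        eg.append(max(b * zi for b, zi in zip(g, z)))
    return ef, eg

def norm_to_alpha(sketch, alpha):
    """Median-based estimate of the alpha-th power of the norm (assumed form)."""
    return math.log(2.0) * statistics.median(sketch) ** alpha

def distance_estimate(ef, eg, alpha):
    """2*||f v g||^alpha - ||f||^alpha - ||g||^alpha, estimated from sketches."""
    efg = [max(a, b) for a, b in zip(ef, eg)]   # sketch of f v g, by (4.12)
    return (2.0 * norm_to_alpha(efg, alpha)
            - norm_to_alpha(ef, alpha) - norm_to_alpha(eg, alpha))

rng = random.Random(2)                   # arbitrary fixed seed (assumption)
alpha, K = 2.0, 4000                     # illustrative values (assumption)
f = [float(i + 1) for i in range(20)]    # toy signal 1, 2, ..., 20
g = f[::-1]                              # toy signal 20, 19, ..., 1
true_rho = sum(abs(a ** alpha - b ** alpha) for a, b in zip(f, g))

ef, eg = paired_sketches(f, g, K, alpha, rng)
rho_hat = distance_estimate(ef, eg, alpha)
print(true_rho, rho_hat)
```

Sharing the $Z_j(i)$'s between the two signals is what makes the coordinate-wise maximum of the two sketches a valid sketch of $f \vee g$. The accuracy degrades as $\eta$ decreases, that is, when $f$ and $g$ are close relative to $\|f \vee g\|_{\ell_\alpha}$.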
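Proofs for Section 4

Proof of Theorem 4.1 Observe that, by (1.4), (4.1) and the definition of the point estimate $\widehat f(i_0)$,

$$\widehat f(i_0) = \min_{1\le j\le K} E_j(f)/Z_j(i_0) = \min_{1\le j\le K}\Big( f(i_0) \vee Z_j(i_0)^{-1}\bigvee_{i\ne i_0} f(i)Z_j(i)\Big),$$

where $\vee$ denotes "max". Now $\bigvee_{i\ne i_0} f(i)Z_j(i)$ is independent of $Z_j(i_0) \stackrel{d}{=} Z_j(1)$ and, by max-stability (see Appendix), it equals in distribution $\big(\sum_{i\ne i_0} f(i)^{\alpha}\big)^{1/\alpha} Z_j(2)$. Hence

$$\widehat f(i_0) \stackrel{d}{=} f(i_0) \vee \min_{1\le j\le K}\Big\{Z_j(1)^{-1}\Big(\sum_{i\ne i_0} f(i)^{\alpha}\Big)^{1/\alpha} Z_j(2)\Big\} =: f(i_0) \vee \min_{1\le j\le K} c_f(i_0)Z_j(2)/Z_j(1), \tag{4.16}$$

where $c_f(i_0) = \big(\sum_{i\ne i_0} f(i)^{\alpha}\big)^{1/\alpha}$. By using again the independence in $j$, we get

$$\mathbb{P}\{\widehat f(i_0) = f(i_0)\} = \mathbb{P}\Big\{f(i_0) \ge \min_{1\le j\le K} c_f(i_0)Z_j(2)/Z_j(1)\Big\} = 1 - \mathbb{P}\{f(i_0) < c_f(i_0)Z_1(2)/Z_1(1)\}^K$$

$$= 1 - \mathbb{P}\{Z_1(1)/Z_1(2) < c_f(i_0)/f(i_0)\}^K. \tag{4.17}$$

By Lemma 4.1, the probability in (4.17) equals $1/(1 + f(i_0)^{\alpha}/c_f(i_0)^{\alpha})$, and hence

$$\mathbb{P}\{\widehat f(i_0) = f(i_0)\} = 1 - \Big(\frac{1}{1 + f(i_0)^{\alpha}/c_f(i_0)^{\alpha}}\Big)^K = 1 - \Big(\frac{\sum_{i\ne i_0} f(i)^{\alpha}}{\sum_{i} f(i)^{\alpha}}\Big)^K = 1 - \Big(1 - \frac{f(i_0)^{\alpha}}{\|f\|_{\ell_\alpha}^{\alpha}}\Big)^K. \tag{4.18}$$

Now $f(i_0) \ge \epsilon\|f\|_{\ell_\alpha}$ implies $\mathbb{P}\{\widehat f(i_0) = f(i_0)\} \ge 1 - (1 - \epsilon^{\alpha})^K$. Requiring $\delta \ge (1 - \epsilon^{\alpha})^K$ gives $K \ge \ln(\delta)/\ln(1 - \epsilon^{\alpha})$. Since $|\ln(1 - x)| \ge x$, for all $x \in (0,1)$, we have $\ln(1/\delta)/\epsilon^{\alpha} \ge \ln(\delta)/\ln(1 - \epsilon^{\alpha})$, for all $\epsilon \in (0,1)$. It therefore suffices to take $K \ge \ln(1/\delta)/\epsilon^{\alpha}$, which completes the proof of part (i).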
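
Relation (4.18) lends itself to a quick Monte Carlo check. The following minimal Python sketch simulates the ideal-precision point estimate $\min_j E_j(f)/Z_j(i_0)$ and compares the empirical frequency of exact recovery with the right-hand side of (4.18). The function names, the toy signal, the seed, the number of trials and the tolerance used to detect equality in floating point are assumptions made for illustration.

```python
import math
import random

def std_frechet(rng, alpha):
    """One standard alpha-Frechet draw via Z = (ln(1/U))^(-1/alpha)."""
    u = rng.random()
    while u == 0.0:                      # guard against log(1/0)
        u = rng.random()
    return math.log(1.0 / u) ** (-1.0 / alpha)

def point_estimate(f, i0, K, alpha, rng):
    """min over j of E_j(f) / Z_j(i0), with ideal-precision Z_j(i)."""
    best = math.inf
    for _ in range(K):
        z = [std_frechet(rng, alpha) for _ in f]
        e_j = max(fi * zi for fi, zi in zip(f, z))
        best = min(best, e_j / z[i0])
    return best

def exact_recovery_probability(f, i0, K, alpha):
    """Right-hand side of (4.18)."""
    share = f[i0] ** alpha / sum(fi ** alpha for fi in f)
    return 1.0 - (1.0 - share) ** K

rng = random.Random(3)                   # arbitrary fixed seed (assumption)
alpha, K, i0 = 1.0, 4, 0                 # illustrative values (assumption)
f = [5.0] + [1.0] * 9                    # toy signal with one dominant entry
trials = 2000

# (f * z) / z may differ from f by a rounding error, hence isclose.
hits = sum(
    math.isclose(point_estimate(f, i0, K, alpha, rng), f[i0], rel_tol=1e-9)
    for _ in range(trials)
)
empirical = hits / trials
predicted = exact_recovery_probability(f, i0, K, alpha)
print(empirical, predicted)
```

Here $f(i_0)^{\alpha}/\|f\|_{\ell_\alpha}^{\alpha} = 5/14$, so even $K = 4$ sketch entries recover $f(i_0)$ exactly in most trials.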

We now prove part (ii) (a). Observe that

$$\big(g_{(1)}(i_0),\ g_{(2)}(i_0)\big) \stackrel{d}{=} \big(f(i_0) \vee \xi_{(1)}(i_0),\ f(i_0) \vee \xi_{(2)}(i_0)\big), \tag{4.19}$$

where $\xi_{(j)}(i_0) \le \xi_{(j+1)}(i_0)$, $j \le K - 1$, is the sorted sample of independent random variables $\xi_j(i_0) := c_f(i_0)Z_j(2)/Z_j(1)$, $j = 1, \ldots, K$. Since the joint distribution of $\xi_{(1)}(i_0)$ and $\xi_{(2)}(i_0)$ has a density, it follows that $\mathbb{P}\{\xi_{(1)}(i_0) = \xi_{(2)}(i_0)\} = 0$. Hence, in view of (4.19), with probability 1, we have

$$\{\mathrm{criterion}(i_0) = 1\} = \{g_{(1)}(i_0) = g_{(2)}(i_0)\} = \{f(i_0) \ge \xi_{(2)}(i_0)\}. \tag{4.20}$$

The right-hand side of (4.20) occurs only if $\{\widehat f(i_0) = f(i_0)\}$, which completes the proof of (ii) (a).

We now turn to part (ii) (b) and estimate $\mathbb{P}\{\mathrm{criterion}(i_0) = 1\} = \mathbb{P}\{f(i_0) \ge \xi_{(2)}(i_0)\}$. We have,

$$\mathbb{P}\{f(i_0) \ge \xi_{(2)}(i_0)\} = \mathbb{P}\{f(i_0) \ge c_f(i_0)Z_j(2)/Z_j(1), \text{ for at least two } j\text{'s}\}$$

$$= 1 - \binom{K}{0}p^K - \binom{K}{1}p^{K-1}(1 - p) \ge 1 - (K + 1)p^{K-1},$$

where $p = \mathbb{P}\{f(i_0)Z_1(1) < c_f(i_0)Z_1(2)\}$. Reasoning as in part (i), we get by Lemma 4.1 that $p = 1 - f(i_0)^{\alpha}/\|f\|_{\ell_\alpha}^{\alpha} \le 1 - \epsilon^{\alpha}$, since $f(i_0) \ge \epsilon\|f\|_{\ell_\alpha}$. We thus need to choose $K$ satisfying the inequality $\delta \ge (K + 1)(1 - \epsilon^{\alpha})^{K-1} \ge (K + 1)p^{K-1}$. For $K \ge 3$, we have $\tilde K := K - 1 \ge (K + 1)/2$, and hence it suffices to have $\delta \ge 2\tilde K(1 - \epsilon^{\alpha})^{\tilde K}$, or simply,

$$\tilde K \ge \frac{\ln(\tilde K)}{\epsilon^{\alpha}} + \frac{\ln(2/\delta)}{\epsilon^{\alpha}}, \tag{4.21}$$
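where we used that $|\ln(1 - \epsilon^{\alpha})| \ge \epsilon^{\alpha}$, $\epsilon \in (0,1)$. Let $\theta > 0$ and $\tilde K \ge 2\ln(2/\delta)C_\theta/\epsilon^{\alpha+\theta}$, for some $C_\theta > 1$ (to be specified). Then, since $\tilde K \ge 2\ln(2/\delta)C_\theta/\epsilon^{\alpha+\theta} \ge 2\ln(2/\delta)/\epsilon^{\alpha}$, it follows that (4.21) holds if $\tilde K \ge 2\ln(\tilde K)/\epsilon^{\alpha}$. Since $\tilde K^{\alpha/(\alpha+\theta)} \ge (2C_\theta)^{\alpha/(\alpha+\theta)}/\epsilon^{\alpha}$, we get that (4.21) holds if

$$\tilde K \ge \frac{\tilde K^{\alpha/(\alpha+\theta)}\,2\ln(\tilde K)}{(2C_\theta)^{\alpha/(\alpha+\theta)}} \ge \frac{2\ln(\tilde K)}{\epsilon^{\alpha}}, \quad\text{or if}\quad (2C_\theta)^{\alpha/(\alpha+\theta)}\,\tilde K^{\theta/(\alpha+\theta)} \ge 2\ln(\tilde K).$$

The last is equivalent to

$$\gamma u \ge \ln(u), \quad\text{where } u = \tilde K^{\theta/(\alpha+\theta)}$$

and $\gamma = (\theta/(\alpha+\theta))\,2^{-\theta/(\alpha+\theta)}\,C_\theta^{\alpha/(\alpha+\theta)}$. We have that $\gamma u \ge \ln(u)$, $u > 1$, for all $\gamma \ge 1/e$, and thus for $C_\theta$ we obtain:

$$\gamma = \frac{\theta}{\alpha+\theta}\,2^{-\theta/(\alpha+\theta)}\,C_\theta^{\alpha/(\alpha+\theta)} \ge 1/e \quad\text{or}\quad C_\theta \ge \frac{2^{\theta/\alpha}}{e^{1+\theta/\alpha}}\Big(1 + \frac{\alpha}{\theta}\Big)^{1+\theta/\alpha} + 1, \tag{4.22}$$

where we add 1 in the last relation to ensure that $C_\theta > 1$. □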
Proof of Theorem 4.2 We will first show that

$$\mathbb{P}\{\tilde f(i_0) = f(i_0)\} \ge 1 - 2\delta, \tag{4.23}$$

where $\tilde f(i_0)$ is defined as $\widehat f(i_0)$ but based on truly independent variables $\tilde Z_j(i)$ which satisfy (4.7). Recall that Relation (4.7) holds if (4.5) holds. Choose $\gamma > 0$ and $L$ so that $\mathbb{P}(B) \ge 1 - \delta$, where $B$ denotes the event {Condition (4.5) holds}. As in the proof of Theorem 4.1,

$$\{\tilde f(i_0) = f(i_0)\} = \Big\{f(i_0) \ge \min_{1\le j\le K} \tilde Z_j(i_0)^{-1}\bigvee_{i\ne i_0} f(i)\tilde Z_j(i)\Big\}.$$

Observe that, on the event $B$, by (4.7), $\bigvee_{i\ne i_0} f(i)\tilde Z_j(i) \le \bigvee_{i\ne i_0} f(i)Z_j(i) + \beta s_f(i_0)$, where $s_f(i_0) := \max_{i\ne i_0} f(i)$. Thus, by using also that $\tilde Z_j(i_0) \ge Z_j(i_0) - \beta$, we get

$$\{\tilde f(i_0) = f(i_0)\} \supseteq B \cap \Big\{f(i_0) \ge \min_{1\le j\le K}(Z_j(i_0) - \beta)^{-1}\Big(\bigvee_{i\ne i_0} f(i)Z_j(i) + \beta s_f(i_0)\Big)\Big\} =: B \cap C. \tag{4.24}$$

Since $\mathbb{P}(B) \ge 1 - \delta$, by Relation (4.24) it follows that, to establish (4.23), it is enough to show that $\mathbb{P}(C) \ge 1 - \delta$, where $C$ denotes the second event on the right-hand side of (4.24). Indeed, $\mathbb{P}\{\tilde f(i_0) = f(i_0)\} \ge \mathbb{P}(B \cap C) \ge 1 - \mathbb{P}(B^c) - \mathbb{P}(C^c) \ge 1 - 2\delta$. The event $C$, however, involves ideal precision Fréchet random variables whose corresponding $U_j(i)$'s are not bound to satisfy Condition (4.5). We can therefore manipulate $C$ as in the proof of Theorem 4.1 (i).

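Since $\bigvee_{i\ne i_0} f(i)Z_1(i) \stackrel{d}{=} c_f(i_0)Z_1(2)$, where $c_f(i_0) = \big(\sum_{i\ne i_0} f(i)^{\alpha}\big)^{1/\alpha}$, by independence:

$$\mathbb{P}(C) \ge 1 - \mathbb{P}\{(Z_1(1) - \beta)f(i_0) < c_f(i_0)Z_1(2) + \beta s_f(i_0)\}^K$$

$$= 1 - \Big(\mathbb{E}\exp\{-(\tilde c_f(i_0)Z_1(2) + \beta(1 + \tilde s_f(i_0)))^{-\alpha}\}\Big)^K,$$

where $\tilde c_f(i_0) = c_f(i_0)/f(i_0)$ and $\tilde s_f(i_0) = s_f(i_0)/f(i_0)$. We now bound above the expectation in the last relation. Note that $\tilde c_f(i_0)z + \beta(1 + \tilde s_f(i_0)) \le 2\tilde c_f(i_0)z$, for all $z \ge \beta(1 + \tilde s_f(i_0))/\tilde c_f(i_0)$. Thus, by (1.5) and as in Relation (4.30) below,

$$1 - \mathbb{P}(C) \le \Big(\mathbb{P}\{Z_1(2) < \beta(1 + \tilde s_f(i_0))/\tilde c_f(i_0)\} + \mathbb{E}\exp\{-(2\tilde c_f(i_0)Z_1(2))^{-\alpha}\}\Big)^K, \tag{4.25}$$

where $p(\beta) := \mathbb{P}\{Z_1(2) < \beta(1 + \tilde s_f(i_0))/\tilde c_f(i_0)\} = \exp\{-(\beta(1 + \tilde s_f(i_0))/\tilde c_f(i_0))^{-\alpha}\}$. By Lemma 4.1, the expectation on the right-hand side of (4.25) equals $(2\tilde c_f(i_0))^{\alpha}/((2\tilde c_f(i_0))^{\alpha} + 1)$.

Let now $K$ be such that $(1 - \epsilon^{\alpha})^K \le \delta$, that is, $K = O(\ln(1/\delta)/\epsilon^{\alpha})$. Then, in view of (4.25), $\mathbb{P}(C) \ge 1 - \delta$, provided that

$$\frac{1}{(2\tilde c_f(i_0))^{\alpha} + 1} \ge \epsilon^{\alpha} + p(\beta),$$

or, equivalently, if $f(i_0)^{\alpha}/\big(f(i_0)^{\alpha} + 2^{\alpha}\sum_{i\ne i_0} f(i)^{\alpha}\big) \ge \epsilon^{\alpha} + p(\beta)$. The last inequality holds if

$$f(i_0) \ge 2(\epsilon^{\alpha} + p(\beta))^{1/\alpha}\|f\|_{\ell_\alpha}.$$

Thus, to prove (4.23), it remains to show that $(\epsilon^{\alpha} + p(\beta))^{1/\alpha} \le (1 + C_\alpha)\epsilon$, if $f(i_0)/c_f(i_0) \le D$ holds and $\beta \le \epsilon^{\alpha}/(D + 1)$, where $\beta$ is the "precision" parameter in (4.6). We have that $(1 + \tilde s_f(i_0))/\tilde c_f(i_0) \le 1 + f(i_0)/c_f(i_0) \le 1 + D$, and thus

$$p(\beta) \le \Phi_\alpha(\beta(D + 1)) \le \Phi_\alpha(\epsilon^{\alpha}),$$

since $\beta(D + 1) \le \epsilon^{\alpha}$. One can show that $\Phi_\alpha''(x) > 0$, $\forall x \in (0, (\alpha/(\alpha + 1))^{1/\alpha})$, and hence, since $\Phi_\alpha(0+) = 0$, convexity yields $\Phi_\alpha(x) \le C_\alpha x$, $\forall x \in (0, (\alpha/(\alpha + 1))^{1/\alpha})$, for a constant $C_\alpha$ depending only on $\alpha$. Hence, $p(\beta) \le C_\alpha\epsilon^{\alpha}$, for all $\epsilon \in (0, (\alpha/(\alpha + 1))^{1/\alpha^2})$, which implies (4.23).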

Now, consider a point estimator $\widehat f(i_0)$, defined as $\tilde f(i_0)$, but with $\tilde Z_j(i)$ replaced by pseudo-random variables, with the same precision (i.e. taking values in the set $V_L = \{p/q,\ p, q = 0, 1, \ldots, L\}$). We will argue that a pseudo-random number generator exists so that

$$\mathbb{P}\{\widehat f(i_0) = \tilde f(i_0)\} \ge 1 - \delta, \tag{4.26}$$

for some $\tilde f(i_0)$ based on independent $\tilde Z_j(i)$'s. This, in view of (4.23), would imply (4.9).

We first need (|4.5p to hold with probability at least (1 — 5) with 7 > such that /3 = 
0(l/L^ l+l / a ) < e a /D. This can be achieved by taking L = 0((DNK/Se) ^). Now, to 
ensure that (4.23) holds, it suffices to take $K = O(\ln(1/\delta)/\epsilon^{\alpha})$. Therefore, one needs $\log_2(L) = O(\ln(DN/\delta\epsilon))$ bits to represent each $\tilde Z_j(i)$. Hence our algorithm uses $O(\log_2(L)N\ln(1/\delta)/\epsilon^{\alpha})$ random bits. As in Indyk (2000), by using the results of Nisan (1990), for each $j$, $1 \le j \le K$, one can generate pseudo-random $\tilde U_j(i)$'s, which are "very close" to some independent $U_j(i)$'s, by using only $O(\log_2(L)\log_2(N/\delta)\ln(1/\delta)/\epsilon^{\alpha})$ truly random seeds. These pseudo-random variables would "fool" our algorithm with probability at least $(1 - \delta)$ since it uses only $O(\log_2(L)\ln(1/\delta)/\epsilon^{\alpha})$ space for computations with random bits. That is, one has (4.26). In summary, we need to store $O(K\log_2(L)\log_2(N/\delta)) = O(\log_2(DN/\delta\epsilon)\log_2(N/\delta)\log_2(1/\delta)/\epsilon^{\alpha})$ bits, needed primarily for the truly random seeds, and to perform about $O(K\log_2(L)) = O(\log_2(DN/\delta\epsilon)\log_2(1/\delta)/\epsilon^{\alpha})$ bit-wise operations per stream item, in order to maintain the sketch. □

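Proof of Theorem 4.3 Let, as in (4.1), $g_j(i_0) := \bigvee_{i} f(i)\tilde Z_j(i)/\tilde Z_j(i_0) =: f(i_0) \vee \xi_j(i_0)$, where $\xi_j(i_0) := \bigvee_{i\ne i_0} f(i)\tilde Z_j(i)/\tilde Z_j(i_0)$, $j = 1, \ldots, K$. Since $\{\mathrm{criterion}(i_0) = 1\} = \{f(i_0) \vee \xi_{(2)}(i_0) = f(i_0) \vee \xi_{(1)}(i_0)\}$, it follows that

$$\{\widehat f(i_0) \ne f(i_0)\} \cap \{\mathrm{criterion}(i_0) = 1\} \subset \{\xi_{(2)}(i_0) = \xi_{(1)}(i_0)\},$$

where $\xi_{(1)}(i_0) \le \xi_{(2)}(i_0) \le \cdots$ is the ordered sample of $\xi_j(i_0)$, $j = 1, \ldots, K$. Therefore, the probability in (4.10) is bounded above by:

$$\mathbb{P}\{\xi_{(2)}(i_0) = \xi_{(1)}(i_0)\} \le \mathbb{P}\{\xi_{j_1}(i_0) = \xi_{j_2}(i_0) \text{ for some } j_1 \ne j_2,\ j_1, j_2 = 1, \ldots, K\} \le \binom{K}{2}\mathbb{P}\{\xi_1(i_0) = \xi_2(i_0)\}, \tag{4.27}$$

where the last step is the union bound over the pairs $j_1 < j_2$. We now focus on bounding the last probability. Since the $\xi_j(i_0)$'s are independent and discrete random variables, we have

$$\mathbb{P}\{\xi_1(i_0) = \xi_2(i_0)\} = \sum_{x:\ x \text{ is an atom}} \mathbb{P}\{\xi_1(i_0) = x\}^2. \tag{4.28}$$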

Let $\eta > 0$ and observe that $(z - \beta) \ge (1 - \eta)z$ and $(z + \beta) \le (1 + \eta)z$, for all $z \ge \beta/\eta$. Thus, in view of (4.7),

$$\xi_1(i_0) \le \bigvee_{i\ne i_0} f(i)\frac{Z_1(i) + \beta}{Z_1(i_0) - \beta} \le \frac{1 + \eta}{1 - \eta}\bigvee_{i\ne i_0} f(i)\frac{Z_1(i)}{Z_1(i_0)}, \quad\text{if } Z_1(i) \ge \beta/\eta,\ \forall i.$$

Since the $Z_1(i)$'s are ideal precision, independent and $\alpha$-Fréchet, $\bigvee_{i\ne i_0} f(i)Z_1(i)/Z_1(i_0) \stackrel{d}{=} c_f(i_0)Z_1(2)/Z_1(1)$, where $c_f(i_0) = \big(\sum_{i\ne i_0} f(i)^{\alpha}\big)^{1/\alpha}$. Therefore,

$$\xi_1(i_0) \stackrel{d}{\le} c_f(i_0)\frac{1 + \eta}{1 - \eta}\frac{Z_1(2)}{Z_1(1)} =: \xi^*, \quad\text{if } Z_1(i) \ge \beta/\eta,\ \forall i,$$

where $\stackrel{d}{\le}$ denotes dominance in distribution. Similarly, $\xi_1(i_0) \stackrel{d}{\ge} \xi_*$, where $\xi_* := c_f(i_0)\frac{1 - \eta}{1 + \eta}Z_1(2)/Z_1(1)$. Thus,

$$\mathbb{P}\{\xi_1(i_0) = x\} \le \mathbb{P}(\{\xi_1(i_0) = x\} \cap \{Z_1(i) \ge \beta/\eta,\ \forall i\}) + \mathbb{P}\{Z_1(i) < \beta/\eta, \text{ for some } i\}$$

$$\le F_{\xi_*}(x) - F_{\xi^*}(x) + \big(1 - \mathbb{P}\{Z_1(1) \ge \beta/\eta\}^N\big),$$

where the last inequality follows from the fact that $\xi_*$ and $\xi^*$ have continuous cumulative distribution functions $F_\xi(x) := \mathbb{P}\{\xi \le x\}$.

Thus, for the probability in (4.28), we get:

$$\mathbb{P}\{\xi_1(i_0) = \xi_2(i_0)\} \le \sup_{x}\big(F_{\xi_*}(x) - F_{\xi^*}(x)\big) + \big(1 - (1 - \exp\{-(\beta/\eta)^{-\alpha}\})^N\big). \tag{4.29}$$

By taking $\beta/\eta \le (1/\ln(NK^2/\delta))^{1/\alpha}$, we can make the second term on the right-hand side of (4.29) smaller than $\delta/K^2$ (Lemma 4.2). Indeed, the second term is a monotone increasing function of $\beta/\eta$. Hence by setting $\epsilon := (1/\ln(NK^2/\delta))^{1/\alpha}$ the upper bound in (4.31) becomes $N\exp\{-(1/[\ln(NK^2/\delta)]^{1/\alpha})^{-\alpha}\} = N\exp\{-\ln(NK^2/\delta)\} = \delta/K^2$. Also, by Lemma 4.3, the first term on the right-hand side of (4.29) is bounded above by $\alpha 2^{4\alpha+3}\eta$, for all $\eta \in (0, 1/2)$. Thus, in view of (4.27), the probability in (4.10) is bounded above by $O(K^2\eta) + \delta/2$, which can be made smaller than $\delta$ by taking $\eta = O(\delta/K^2)$ and $\beta/\eta = O((1/\ln(NK^2/\delta))^{1/\alpha})$. This implies that $\beta = O(\delta/(K^2[\ln(NK^2/\delta)]^{1/\alpha}))$ would ensure that (4.10) holds. Observe that the constant in the last $O$-bound does not depend on the signal $f$. □

Auxiliary lemmas 
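Lemma 4.1 Let $\xi$ and $\eta$ be independent, standard $\alpha$-Fréchet variables. Then, for all $x > 0$,

$$\mathbb{P}\{\xi/\eta \le x\} = \frac{1}{x^{-\alpha} + 1}.$$

Proof: By independence, and in view of (1.5), we have:

$$\mathbb{P}\{\xi/\eta \le x\} = \mathbb{E}\exp\{-(\eta x)^{-\alpha}\} = \int_0^\infty e^{-y^{-\alpha}x^{-\alpha}}\,de^{-y^{-\alpha}} = \int_0^1 u^{x^{-\alpha}}\,du = \frac{1}{x^{-\alpha} + 1}. \tag{4.30}$$

Here, we used the change of variables $u := e^{-y^{-\alpha}}$. □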
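
The closed form of Lemma 4.1 can be compared with a simulated ratio of two independent standard $\alpha$-Fréchet variables. The following is a minimal Python sketch; the function names, the seed, the sample size and the values of $\alpha$ and $x$ are illustrative assumptions.

```python
import math
import random

def std_frechet(rng, alpha):
    """One standard alpha-Frechet draw via Z = (ln(1/U))^(-1/alpha)."""
    u = rng.random()
    while u == 0.0:                      # guard against log(1/0)
        u = rng.random()
    return math.log(1.0 / u) ** (-1.0 / alpha)

def ratio_cdf(x, alpha):
    """Closed form of Lemma 4.1: P{xi/eta <= x} = 1 / (x^(-alpha) + 1)."""
    return 1.0 / (x ** (-alpha) + 1.0)

rng = random.Random(4)                   # arbitrary fixed seed (assumption)
alpha, x, n = 1.5, 0.7, 100_000          # illustrative values (assumption)

# Count how often the ratio of two independent draws falls below x.
count = sum(
    std_frechet(rng, alpha) / std_frechet(rng, alpha) <= x for _ in range(n)
)
empirical = count / n
print(empirical, ratio_cdf(x, alpha))
```

At $x = 1$ the formula gives $1/2$ for every $\alpha$, as it must by the symmetry between $\xi$ and $\eta$.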

Lemma 4.2 For all $\epsilon \in (0, 1)$ and $N \ge 1$, we have

$$1 - (1 - \exp\{-\epsilon^{-\alpha}\})^N \le Ne^{-\epsilon^{-\alpha}}. \tag{4.31}$$

Proof: We have that $(1 - x)^N \ge 1 - Nx$, for all $x \in (0,1)$. Thus, for all $x \in (0,1)$, $1 - (1 - x)^N \le 1 - (1 - Nx) = Nx$ and by setting $x := e^{-\epsilon^{-\alpha}}$, we obtain (4.31). □

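Lemma 4.3 Let $\xi^* = c\frac{1 + \eta}{1 - \eta}Z'/Z''$ and $\xi_* = c\frac{1 - \eta}{1 + \eta}Z'/Z''$, for some $c > 0$ and $\eta \in (0, 1/2)$, where $Z'$ and $Z''$ are independent standard $\alpha$-Fréchet variables. Then, we have

$$\sup_{x > 0}|F_{\xi_*}(x) - F_{\xi^*}(x)| \le \alpha 2^{4\alpha+3}\eta, \quad \forall\eta \in (0, 1/2),$$

where $F_{\xi^*}(x) := \mathbb{P}\{\xi^* \le x\}$ and $F_{\xi_*}(x) := \mathbb{P}\{\xi_* \le x\}$.

Proof: We have that

$$\Delta F := F_{\xi_*}(x) - F_{\xi^*}(x) = \mathbb{P}\{Z'/Z'' \le C^*x\} - \mathbb{P}\{Z'/Z'' \le C_*x\},$$

where $C^* = (1 + \eta)/(c(1 - \eta))$ and $C_* = (1 - \eta)/(c(1 + \eta))$. By Lemma 4.1, we have that

$$\Delta F = \psi\Big(\frac{(1 - \eta)^{\alpha}}{(1 + \eta)^{\alpha}x^{\alpha}}\Big) - \psi\Big(\frac{(1 + \eta)^{\alpha}}{(1 - \eta)^{\alpha}x^{\alpha}}\Big),$$

where $\psi(y) = 1/(c^{\alpha}y + 1)$. By the mean value theorem, since $|\psi'(y)| = c^{\alpha}/(c^{\alpha}y + 1)^2$ is monotone decreasing in $y > 0$, the last expression is bounded above by:

$$\Big|\psi'\Big(\frac{(1 - \eta)^{\alpha}}{(1 + \eta)^{\alpha}x^{\alpha}}\Big)\Big|\Big(\frac{(1 + \eta)^{\alpha}}{(1 - \eta)^{\alpha}x^{\alpha}} - \frac{(1 - \eta)^{\alpha}}{(1 + \eta)^{\alpha}x^{\alpha}}\Big) = \frac{c^{\alpha}(1 + \eta)^{2\alpha}x^{\alpha}}{\big(c^{\alpha}(1 - \eta)^{\alpha} + (1 + \eta)^{\alpha}x^{\alpha}\big)^2}\cdot\frac{(1 + \eta)^{2\alpha} - (1 - \eta)^{2\alpha}}{(1 - \eta^2)^{\alpha}}$$

$$\le \frac{(1 + \eta)^{\alpha}}{2(1 - \eta)^{\alpha}}\cdot\frac{(1 + \eta)^{2\alpha} - (1 - \eta)^{2\alpha}}{(1 - \eta^2)^{\alpha}} = \frac{(1 + \eta)^{2\alpha} - (1 - \eta)^{2\alpha}}{2(1 - \eta)^{2\alpha}}, \tag{4.32}$$

where the last inequality follows from the fact that $ab/(a + b)^2 \le 1/2$, $a, b > 0$, with $a := c^{\alpha}(1 - \eta)^{\alpha}$ and $b := (1 + \eta)^{\alpha}x^{\alpha}$. By using the mean value theorem again, we obtain that, for some $\theta \in [-1, 1]$, the right-hand side of (4.32) is bounded above by

$$\frac{4\alpha\eta(1 + \theta\eta)^{2\alpha-1}}{2(1 - \eta)^{2\alpha}} \le \alpha 2^{2\alpha+2+|2\alpha-1|}\eta \le \alpha 2^{4\alpha+3}\eta,$$

for all $\eta \in (0, 1/2)$. □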